Mathematicians spent 2025 exploring the edge of mathematics
In 2025, the edges of mathematics came a little more sharply into view when members of the online Busy Beaver Challenge community closed in on a huge number that threatens to defy the logical underpinnings of the subject. This number is the next in the "Busy Beaver" sequence, a series of ever-larger numbers that emerges from a seemingly simple question: how do we know if a computer program will run forever? To find out, researchers turn to the work of mathematician Alan Turing, who showed that any computer algorithm can be mimicked by imagining a simplified device called a Turing machine. More complex algorithms correspond to Turing machines with larger sets of instructions or, in mathematical parlance, more states. For example, BB(1) is 1 and BB(2) is 6, so doubling the complexity of the algorithm increases its maximum runtime sixfold.
Mathematicians are chasing a number that may reveal the edge of maths
Amateur mathematicians are closing in on an unimaginably huge number – one so large that it brushes up against the edge of what is even knowable within the framework of modern mathematics. It all stems from a seemingly simple question: how do you know if a computer program will run forever? Answering this starts with mathematician Alan Turing. In the 1930s, he showed that any computer algorithm can be mimicked by imagining a simple "Turing machine" that reads and writes 0s and 1s on an infinitely long tape by following a set of instructions called states, with more complex algorithms requiring more states. For every number of states, such as 5 or 100, there are finitely many corresponding Turing machines, but it is unclear how long each of these machines must run.
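The mechanics described above fit in a few lines of code. The sketch below is a minimal Turing-machine simulator; the transition table is the standard two-state "busy beaver" champion machine, which halts after exactly 6 steps, matching BB(2) = 6.

```python
from collections import defaultdict

def run(transitions, start="A", halt="H", max_steps=10_000):
    """Simulate a Turing machine on an initially all-zero tape.
    Returns (number of steps taken, number of 1s left on the tape)."""
    tape = defaultdict(int)           # infinite tape, default symbol 0
    head, state, steps = 0, start, 0
    while state != halt and steps < max_steps:
        write, move, state = transitions[(state, tape[head])]
        tape[head] = write
        head += 1 if move == "R" else -1
        steps += 1
    return steps, sum(tape.values())

# Two-state busy beaver champion: (state, symbol read) -> (write, move, next state)
BB2 = {
    ("A", 0): (1, "R", "B"), ("A", 1): (1, "L", "B"),
    ("B", 0): (1, "L", "A"), ("B", 1): (1, "R", "H"),
}
steps, ones = run(BB2)
print(steps, ones)  # 6 steps, 4 ones written
```

Nothing in this toy hints at how fast the sequence explodes: already at five states the champion machine runs for tens of millions of steps.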
Learning from Complementary Features
Sugiyama, Kosuke, Uchida, Masato
While precise data observation is essential for the learning processes of predictive models, it can be challenging owing to factors such as insufficient observation accuracy, high collection costs, and privacy constraints. In this paper, we examine cases where some qualitative features are unavailable as precise information indicating "what it is," and are instead observed as complementary information indicating "what it is not." We refer to features defined by precise information as ordinary features (OFs) and those defined by complementary information as complementary features (CFs). We then formulate a new learning scenario termed Complementary Feature Learning (CFL), where predictive models are constructed using instances consisting of OFs and CFs. The simplest formalization of CFL applies conventional supervised learning directly using the observed values of CFs. However, this approach does not resolve the ambiguity associated with CFs, making learning challenging and complicating the interpretation of the predictive model's specific predictions. Therefore, we derive an objective function from an information-theoretic perspective to estimate the OF values corresponding to CFs and to predict output labels based on these estimations. Based on this objective function, we propose a theoretically guaranteed graph-based estimation method, along with its practical approximation, for estimating OF values corresponding to CFs. The results of numerical experiments conducted with real-world data demonstrate that our proposed method effectively estimates OF values corresponding to CFs and predicts output labels.
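One simple way to picture an OF versus a CF (a hypothetical encoding for illustration only, not the estimation method proposed in the paper): an OF pins down a single value, so it can be encoded one-hot, while a CF only rules one value out, which can be represented as a uniform belief over the remaining values.

```python
import numpy as np

CATEGORIES = ["red", "green", "blue"]   # hypothetical qualitative feature

def ordinary_feature(value):
    """OF: precise information 'what it is' -> one-hot vector."""
    v = np.zeros(len(CATEGORIES))
    v[CATEGORIES.index(value)] = 1.0
    return v

def complementary_feature(excluded):
    """CF: complementary information 'what it is not' ->
    uniform belief over every category except the excluded one."""
    v = np.full(len(CATEGORIES), 1.0 / (len(CATEGORIES) - 1))
    v[CATEGORIES.index(excluded)] = 0.0
    return v

print(ordinary_feature("red"))        # [1. 0. 0.]
print(complementary_feature("red"))   # [0.  0.5 0.5]
```

The ambiguity the abstract mentions is visible here: a CF leaves probability mass spread over several candidate values, and the paper's contribution is precisely an objective function for collapsing that mass back onto an OF estimate.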
Forward-PECVaR Algorithm: Exact Evaluation for CVaR SSPs
Reis, Willy Arthur Silva, Pais, Denis Benevolo, Freire, Valdinei, Delgado, Karina Valdivia
The Stochastic Shortest Path (SSP) problem models probabilistic sequential-decision problems where an agent must pursue a goal while minimizing a cost function. Because of the probabilistic dynamics, it is desirable to have a cost function that accounts for risk. Conditional Value at Risk (CVaR) is a criterion that allows modeling an arbitrary level of risk by considering the expectation of the fraction $\alpha$ of worst trajectories. Although an optimal policy is non-Markovian, approximate solutions of CVaR-SSPs can be found with Value Iteration based algorithms such as CVaR Value Iteration with Linear Interpolation (CVaRVIQ) and CVaR Value Iteration via Quantile Representation (CVaRVILI). These types of solutions depend on the algorithm's parameters, such as the number of atoms and $\alpha_0$ (the minimum $\alpha$). To compare the policies returned by these algorithms, we need a way to exactly evaluate stationary policies of CVaR-SSPs. Although there is an algorithm that evaluates these policies, it only works on problems with uniform costs. In this paper, we propose a new algorithm, Forward-PECVaR (ForPECVaR), that exactly evaluates stationary policies of CVaR-SSPs with non-uniform costs. We empirically evaluate the approximate CVaR Value Iteration algorithms, examining the quality of their solutions compared with the exact evaluation, and the influence of the algorithm parameters on the quality and scalability of the solutions. Experiments in two domains show that it is important to use an $\alpha_0$ smaller than the target $\alpha$ and an adequate number of atoms to obtain a good approximation.
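The CVaR criterion itself is easy to state on a sample of trajectory costs: sort, keep the worst $\alpha$ fraction, average. The sketch below shows only that definition (it is not the ForPECVaR algorithm, which evaluates policies exactly on the underlying MDP rather than on samples).

```python
import numpy as np

def cvar(costs, alpha):
    """Empirical CVaR_alpha: mean of the worst alpha-fraction of costs.
    Simplifying assumption: alpha * len(costs) is (close to) an integer."""
    costs = np.sort(np.asarray(costs, dtype=float))[::-1]  # worst first
    k = int(round(alpha * len(costs)))
    return costs[:k].mean()

costs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(cvar(costs, 0.2))   # mean of {10, 9} = 9.5
print(cvar(costs, 1.0))   # alpha = 1 recovers the plain expectation = 5.5
```

Note how $\alpha = 1$ degenerates to the risk-neutral expectation, while small $\alpha$ focuses entirely on the tail; this is why the abstract stresses choosing $\alpha_0$ below the target $\alpha$.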
Photo Mosaics with Nearest Neighbors: Machine Learning for Digital Art
Technological innovation is increasing at a rapid pace and has made digital storage extremely cheap and accessible. Additionally, most people now have phones with cameras that are able to capture high quality images. The majority of images taken are viewed a few times and then sent to sit on a hard drive or some cloud storage service. I am no different, and since I had some extra time during the COVID-19 lockdowns, I came up with some software to give the photos in people's libraries a second life. This software creates photo mosaics.
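The matching step behind such mosaics can be sketched as a nearest-neighbour search in colour space: summarize each library photo by its mean colour, then replace every cell of the target picture with the closest library tile. The toy below uses random arrays in place of real photos; the array shapes and function names are illustrative assumptions, not taken from the actual software.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "library": 50 tile images of 8x8 RGB pixels,
# each summarised by its mean colour (a 3-vector).
library = rng.integers(0, 256, size=(50, 8, 8, 3))
tile_colors = library.reshape(50, -1, 3).mean(axis=1)    # shape (50, 3)

def nearest_tile(cell):
    """Index of the library tile whose mean colour is closest
    (Euclidean distance in RGB) to the mean colour of the target cell."""
    c = cell.reshape(-1, 3).mean(axis=0)
    return int(np.argmin(((tile_colors - c) ** 2).sum(axis=1)))

# Mosaic: replace every 8x8 cell of a 64x64 target with its nearest tile.
target = rng.integers(0, 256, size=(64, 64, 3))
mosaic = np.zeros_like(target)
for i in range(0, 64, 8):
    for j in range(0, 64, 8):
        mosaic[i:i+8, j:j+8] = library[nearest_tile(target[i:i+8, j:j+8])]
print(mosaic.shape)  # (64, 64, 3)
```

A real implementation would add refinements such as finer colour descriptors per tile and limits on how often a single photo may be reused, but the nearest-neighbour core is the same.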
The Exact Asymptotic Form of Bayesian Generalization Error in Latent Dirichlet Allocation
Latent Dirichlet allocation (LDA) is applied to knowledge discovery via dimension reduction and clustering in many fields. However, its generalization error had not yet been clarified, since LDA is a singular statistical model for which there is no one-to-one map from parameters to probability distributions. In this paper, we give the exact asymptotic form of its generalization error and marginal likelihood, by theoretically analyzing its learning coefficient using algebraic geometry. The theoretical result shows that the Bayesian generalization error in LDA is expressed in terms of that in matrix factorization and a penalty from the simplex restriction of LDA's parameter region.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Efficient Evaluation of the Partition Function of RBMs with Annealed Importance Sampling
Mazzanti, Ferran, Romero, Enrique
Probabilistic models based on Restricted Boltzmann Machines (RBMs) imply the evaluation of normalized Boltzmann factors, which in turn requires the evaluation of the partition function Z. The exact evaluation of Z, though, becomes a prohibitively expensive task as the system size increases. This is even worse for the most common learning algorithms for RBMs, where the exact evaluation of the gradient of the log-likelihood of the empirical distribution of the data includes the computation of Z at each iteration. The Annealed Importance Sampling (AIS) method provides a tool to stochastically estimate the partition function of the system. So far, the standard use of the AIS algorithm in the Machine Learning context has relied on a large number of Monte Carlo steps. In this work we show that this may not be required if a proper starting probability distribution is employed as the initialization of the AIS algorithm. We analyze the performance of AIS in both small- and large-sized problems, and show that in both cases a good estimation of Z can be obtained with little computational cost.
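For a tiny RBM the "prohibitive" sum over all $2^{n+m}$ joint states can still be carried out exactly, which makes the cost argument concrete and provides a ground truth against which an AIS estimate could be checked. The sketch below (a minimal example with made-up parameters, not the paper's experimental setup) also verifies the standard identity that lets the hidden units be summed out analytically.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 3                         # visible / hidden units (tiny on purpose)
W = rng.normal(scale=0.5, size=(n, m))
b = rng.normal(scale=0.1, size=n)   # visible bias
c = rng.normal(scale=0.1, size=m)   # hidden bias

# Brute force: Z = sum over all 2^(n+m) joint states of exp(v'Wh + b'v + c'h).
Z_brute = sum(
    np.exp(v @ W @ h + b @ v + c @ h)
    for v in map(np.array, itertools.product([0, 1], repeat=n))
    for h in map(np.array, itertools.product([0, 1], repeat=m))
)

# Hidden units summed out analytically: Z = sum_v exp(b'v) prod_j (1 + exp(c_j + (v'W)_j)).
# Still 2^n terms, so the exponential wall is only pushed back, not removed.
Z_marg = sum(
    np.exp(b @ v) * np.prod(1 + np.exp(c + v @ W))
    for v in map(np.array, itertools.product([0, 1], repeat=n))
)
print(Z_brute, Z_marg)
```

Doubling n or m doubles the exponent in the number of terms, which is exactly why stochastic estimators such as AIS are needed at realistic sizes.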
Upper Bound of Bayesian Generalization Error in Non-negative Matrix Factorization
Hayashi, Naoki, Watanabe, Sumio
Recently, nonnegative matrix factorization (NMF) [1, 2] has been applied to text mining [3], signal processing [4, 5, 6], bioinformatics [7], and consumer analysis [8]. Experiments have shown that NMF yields new knowledge discovery methods; however, its mathematical properties as a learning machine are not yet clarified, since it is not a regular statistical model. A statistical model is called regular if the map from a parameter to a probability density function is one-to-one and if the likelihood function can be approximated by a Gaussian function. It is proved that, if a statistical model is regular and if the true distribution is realizable by the statistical model, then the generalization error is asymptotically equal to d/(2n), where d, n, and the generalization error are the dimension of the parameter, the sample size, and the expected Kullback-Leibler divergence between the true distribution and the estimated learning machine, respectively. However, the statistical model used in NMF is not regular because the map from a parameter to a probability density function is not injective.
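For reference, the regular-case statement above can be written out. With $q$ the true distribution, $p(x \mid X^n)$ the predictive distribution after $n$ samples, and $G_n$ the generalization error,

$$ G_n = \mathrm{KL}\bigl(q(x) \,\|\, p(x \mid X^n)\bigr), \qquad \mathbb{E}[G_n] = \frac{d}{2n} + o\!\left(\frac{1}{n}\right). $$

For a singular model such as NMF, the coefficient $d/2$ is replaced by a learning coefficient $\lambda$ satisfying $\lambda \leq d/2$, so bounding $\lambda$ (as this paper does) directly bounds the Bayesian generalization error.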
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
Precisely Verifying the Null Space Conditions in Compressed Sensing: A Sandwiching Algorithm
In this paper, we propose new efficient algorithms to verify the null space condition in compressed sensing (CS). Given an $(n-m) \times n$ ($m>0$) CS matrix $A$ and a positive $k$, we are interested in computing $\displaystyle \alpha_k = \max_{\{z: Az=0,z\neq 0\}}\max_{\{K: |K|\leq k\}} \frac{\|z_K \|_{1}}{\|z\|_{1}}$, where $K$ represents subsets of $\{1,2,...,n\}$, and $|K|$ is the cardinality of $K$. In particular, we are interested in finding the maximum $k$ such that $\alpha_k < \frac{1}{2}$. However, computing $\alpha_k$ is known to be extremely challenging. In this paper, we first propose a series of new polynomial-time algorithms to compute upper bounds on $\alpha_k$. Based on these new polynomial-time algorithms, we further design a new sandwiching algorithm to compute the \emph{exact} $\alpha_k$ with greatly reduced complexity. When needed, this new sandwiching algorithm also achieves a smooth tradeoff between computational complexity and result accuracy. Empirical results show the performance improvements of our algorithm over existing known methods, and our algorithm outputs precise values of $\alpha_k$ with much lower complexity than exhaustive search.
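The definition of $\alpha_k$ is easiest to see in the one special case where it is not hard: when the null space of $A$ is one-dimensional, $z$ is fixed up to scale, the inner maximum simply picks the $k$ largest entries of $|z|$, and $\alpha_k$ is their $\ell_1$ mass divided by $\|z\|_1$. The sketch below handles only that special case (the paper's algorithms target the general setting, where the outer maximization over the null space makes the problem hard).

```python
import numpy as np

def alpha_k_1d(A, k):
    """alpha_k for a matrix A whose null space is one-dimensional:
    take the single null vector z and return the l1 mass of its
    k largest-magnitude entries divided by ||z||_1."""
    _, s, vh = np.linalg.svd(A)
    z = vh[-1]                          # right singular vector for the ~0 singular value
    mags = np.sort(np.abs(z))[::-1]     # |z_i| in decreasing order
    return mags[:k].sum() / mags.sum()

# Build a 3x4 matrix whose null space is exactly span{[3, -1, 1, 1]}:
z = np.array([3.0, -1.0, 1.0, 1.0])
A = np.linalg.svd(z.reshape(1, -1))[2][1:]   # orthonormal rows orthogonal to z
print(alpha_k_1d(A, 1))   # 3/6 = 0.5, so alpha_1 < 1/2 already fails here
```

In the general case the null space has dimension at least $m$, and the maximization over all of its directions is what forces the upper-bounding and sandwiching machinery of the paper.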